83 research outputs found

    Methods of Optimizing Speech Enhancement for Hearing Applications

    Speech intelligibility in hearing applications suffers from background noise. One of the most effective solutions is to develop speech enhancement algorithms based on the biological traits of the auditory system. In humans, the medial olivocochlear (MOC) reflex, an auditory neural feedback loop, improves signal-in-noise detection by suppressing the cochlear response to noise. The time constant is one of the key attributes of the MOC reflex, as it regulates how the suppression varies over time. Different time constants have been measured in nonhuman mammalian and human auditory systems. Physiological studies have reported that the time constant of the nonhuman mammalian MOC reflex varies with changes in the properties of the stimulus (e.g. frequency, bandwidth). A human-based study suggests that the time constant could vary when the bandwidth of the noise changes. Previous works have developed MOC reflex models and successfully demonstrated the benefits of simulating the MOC reflex for speech-in-noise recognition. However, they often used fixed time constants, and the effect of different time constants on speech perception remains unclear. The main objectives of the present study are (1) to study the effect of the MOC reflex time constant on speech perception in different noise conditions; (2) to develop a speech enhancement algorithm with dynamic time constant optimization that adapts to varying noise conditions to improve speech intelligibility.
    The first part of this thesis studies the effect of the MOC reflex time constant on speech-in-noise perception. Conventional studies do not consider the relationship between the time constant and speech perception, as it is difficult to measure the changes in speech intelligibility caused by varying time constants in human subjects. We use a model to investigate the relationship by combining Meddis’ peripheral auditory model (which includes a MOC reflex) with an automatic speech recognition (ASR) system. The effect of the MOC reflex time constant is studied by adjusting the time constant parameter of the model and testing the speech recognition accuracy of the ASR system. Different time constants derived from human data are evaluated in both speech-like and non-speech-like noise at SNR levels from -10 dB to 20 dB and in the clean speech condition. The results show that long time constants (≥1000 ms) provide a greater improvement in speech recognition accuracy at SNR levels ≤10 dB. A maximum accuracy improvement of 40% (compared to the no-MOC condition) is observed in pink noise at an SNR of 10 dB. Short time constants (<1000 ms) show recognition accuracy over 5% higher than the longer ones at SNR levels ≥15 dB.
    The second part of the thesis develops a novel speech enhancement algorithm based on the MOC reflex, with a time constant that is dynamically optimized for varying SNRs according to a lookup table. The main contributions of this part are as follows. (1) Existing SNR estimation methods struggle with low SNR, nonstationary noise, and computational complexity. High computational complexity increases processing delay, which degrades intelligibility. An SNR estimation method based on the variance of spectral entropy (VSE) is developed, as entropy-based features have been shown to be more robust at low SNR and in nonstationary noise. The SNR is estimated by measuring the VSE of the noisy speech and applying the estimated VSE-SNR relationship functions. The proposed method has an accuracy 5 dB higher than other methods, especially in babble noise with few talkers (2 talkers) and at low SNR levels (<0 dB), with an average processing time only about 30% of that of the noise-power-estimation-based method. The proposed SNR estimation method is further improved by implementing a nonlinear filter-bank. The compression of the nonlinear filter-bank is shown to increase the stability of the relationship functions. As a result, the accuracy is improved by up to 2 dB in all types of tested noise. (2) A modification of Meddis’ MOC reflex model, with a time constant dynamically optimized against varying SNRs, is developed. The model includes a simulated inner hair cell response to reduce the model complexity, and now also includes the SNR estimation method. Previous MOC reflex models often have fixed time constants that do not adapt to varying noise conditions, whilst our modified MOC reflex model has a time constant dynamically optimized according to the estimated SNRs. The results show a speech recognition accuracy 8% higher than that of the model using a fixed time constant of 2000 ms in different types of noise. (3) A speech enhancement algorithm is developed based on the modified MOC reflex model and implemented in an existing hearing aid system. The performance is evaluated by measuring the objective speech intelligibility metric of processed noisy speech. In different types of noise, the proposed algorithm increases intelligibility by at least 20% in comparison to unprocessed noisy speech at SNRs between 0 dB and 20 dB, and by over 15% in comparison to noisy speech processed with the original MOC-based algorithm in the hearing aid.
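    The variance of spectral entropy mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the frame length, hop size, window, or the VSE-SNR relationship functions, so the values below (256-sample frames, 128-sample hop, Hann window) and the function names are assumptions chosen for illustration only, and the mapping from VSE to SNR is deliberately left out.

```python
import numpy as np


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in nats) of one frame's normalized power spectrum."""
    window = np.hanning(len(frame))            # assumed window, not from the abstract
    power = np.abs(np.fft.rfft(frame * window)) ** 2
    p = power / (power.sum() + eps)            # treat the spectrum as a probability distribution
    return float(-np.sum(p * np.log(p + eps)))


def variance_of_spectral_entropy(signal, frame_len=256, hop=128):
    """Variance of the per-frame spectral entropy over the whole signal."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - frame_len + 1, hop)
    entropies = [spectral_entropy(signal[s:s + frame_len]) for s in starts]
    return float(np.var(entropies))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8192)
    noise = rng.standard_normal(8192)          # stationary noise: entropy stays nearly constant
    tone = np.sin(2 * np.pi * 0.05 * t)
    # Toy "speech-like" signal: alternating tonal and noisy segments.
    mixed = np.where((t // 1024) % 2 == 0, tone, noise)
    print("VSE, stationary noise:", variance_of_spectral_entropy(noise))
    print("VSE, alternating signal:", variance_of_spectral_entropy(mixed))
```

    In this toy comparison the alternating signal yields a much larger VSE than stationary noise, which is the kind of contrast an entropy-based SNR estimator could exploit.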

    Career barriers of hospitality and tourism management students and the impacts on their career intention

    This study constructs a three-dimensional model of the perceived career barriers (CB) of hospitality and tourism management (HTM) students, namely personal, social and interactional career barriers, and explores their impacts on students’ professional identity and intention to work in the hospitality and tourism (H&T) industry. The findings, based on a sample of 842 HTM students in mainland China, are as follows. Firstly, the three-dimensional model reveals the structure of HTM students’ perceived career barriers, and all dimensions have significant negative effects on professional identity and career intention. The predictive power of personal career barriers is strongest, followed by interactional and social barriers. Secondly, students’ professional identity mediates the relationship between career barriers and intention. Lastly, the barriers can be negotiated through major satisfaction, as it partially moderates the relationship between career barriers and intention. Managerial implications for the tourism industry and educators are also discussed.

    FedPrompt: Communication-Efficient and Privacy Preserving Prompt Tuning in Federated Learning

    Federated learning (FL) has enabled global model training on decentralized data in a privacy-preserving way by aggregating model updates. However, for many natural language processing (NLP) tasks that utilize pre-trained language models (PLMs) with large numbers of parameters, there are considerable communication costs associated with FL. Recently, prompt tuning, which tunes some soft prompts without modifying PLMs, has achieved excellent performance as a new learning paradigm. We therefore combine the two methods and explore the effect of prompt tuning under FL. In this paper, we propose "FedPrompt" as the first work to study prompt tuning in a model split learning manner using FL, and show that split learning greatly reduces the communication cost, to only 0.01% of the PLMs' parameters, with little decrease in accuracy on both IID and non-IID data distributions. This improves the efficiency of the FL method while also protecting data privacy in prompt tuning. In addition, like PLMs, prompts are uploaded and downloaded between public platforms and personal users, so we investigate whether there is still a backdoor threat when only soft prompts are used in FL scenarios. We further conduct backdoor attacks by data poisoning on FedPrompt. Our experiments show that a normal backdoor attack cannot achieve a high attack success rate, demonstrating the robustness of FedPrompt. We hope this work can promote the application of prompts in FL and raise awareness of the possible security threats.
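    The communication saving described in this abstract comes from exchanging only the soft prompt rather than the full PLM. A minimal sketch of that idea follows, assuming a plain sample-weighted average (FedAvg-style) over client prompts; the abstract does not specify the aggregation rule, so this is not necessarily the paper's method. The function names, the prompt shape (20 tokens by 768 dimensions), and the 110-million-parameter PLM size are hypothetical values picked only to show the order of magnitude.

```python
import numpy as np


def aggregate_prompts(client_prompts, client_sizes):
    """Sample-weighted average of the clients' soft prompts (assumed FedAvg-style rule)."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    stacked = np.stack(client_prompts)          # shape: (clients, tokens, dim)
    return np.tensordot(weights, stacked, axes=1)  # shape: (tokens, dim)


def communication_ratio(prompt_params, plm_params):
    """Fraction of the PLM's parameter count that is sent when only the prompt is exchanged."""
    return prompt_params / plm_params


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens, dim = 20, 768                       # hypothetical soft-prompt shape
    prompts = [rng.standard_normal((tokens, dim)) for _ in range(3)]
    sizes = [100, 300, 600]                     # hypothetical local dataset sizes
    global_prompt = aggregate_prompts(prompts, sizes)
    ratio = communication_ratio(tokens * dim, 110_000_000)  # hypothetical PLM size
    print(global_prompt.shape, f"{ratio:.4%} of PLM parameters per round")
```

    With these assumed sizes the exchanged prompt is on the order of a hundredth of a percent of the PLM's parameters, the same order of magnitude as the figure quoted in the abstract.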

    Mechanical Deformation Induced Continuously Variable Emission for Radiative Cooling

    Passive radiative cooling, which draws heat energy from objects to cold outer space through the atmospheric transparent window (8 µm - 13 µm), is significant for reducing the energy consumption of buildings. Daytime and nighttime radiative cooling have been extensively investigated in the past. However, radiative cooling that can continuously regulate its cooling temperature, like a valve, according to human needs is rarely reported. In this study, we present the concept of a reconfigurable photonic structure for adaptive radiative cooling that continuously varies the emission spectra in the atmospheric window region. This is realized by the deformation of a one-dimensional PDMS grating and a nanoparticle-embedded PDMS thin film when subjected to mechanical strain. The proposed structure reaches different stagnation temperatures under certain strains. A dynamic exchange between two different strains results in the fluctuation of the photonic structure's temperature around a set temperature.

    A Fully Implantable Opto-Electro Closed-Loop Neural Interface for Motor Neuron Disease Studies

    This paper presents a fully implantable closed-loop device for use in freely moving rodents to investigate new treatments for motor neuron disease. The 0.18 µm CMOS integrated circuit comprises 4 stimulators, each featuring 16 channels for optical and electrical stimulation using arbitrary current waveforms at frequencies from 1.5 Hz to 50 kHz, and a bandwidth-programmable front-end for neural recording. The implant uses a Qi wireless inductive link which can deliver >100 mW of power at a maximum distance of 2 cm for a freely moving rodent. A backup rechargeable battery can support 10 mA continuous stimulation currents for 2.5 hours in the absence of an inductive power link. The implant is controlled by a graphical user interface with broad programmable parameters via a Bluetooth Low Energy bidirectional data telemetry link. The encapsulated implant is 40 mm × 20 mm × 10 mm. Measured results are presented showing the electrical performance of the electronics and the packaging method.